238 research outputs found

    Convex recovery of tensors using nuclear norm penalization

    The subdifferential of convex functions of the singular spectrum of real matrices has been widely studied in matrix analysis, optimization and automatic control theory. Convex analysis and optimization over spaces of tensors are now gaining much interest due to their potential applications to signal processing, statistics and engineering. The goal of this paper is to present an application to the problem of low-rank tensor recovery from random linear measurements by extending the results of Tropp to the tensor setting. Comment: To appear in the proceedings of LVA/ICA 2015 in the Czech Republic

    Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data

    We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conjugacy in the model and enables simple and efficient Gibbs sampling and variational Bayes (VB) inference updates, with a computational cost that depends only on the number of nonzeros in the tensor. The factors are also readily interpretable: in our model, each factor corresponds to a "topic". We develop a set of online inference algorithms that allow the model to scale further to massive tensors, for which batch inference methods may be infeasible. We apply our framework to diverse real-world applications, such as multiway topic modeling on a scientific publications database, analyzing a political science data set, and analyzing a massive household transactions data set. Comment: ECML PKDD 201
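    The Poisson-multinomial relationship mentioned in this abstract can be illustrated with a standard identity: independent counts x_k ~ Poisson(rate_k) have the same joint distribution as first drawing the total N ~ Poisson(sum of rates) and then splitting it as Multinomial(N, rates / sum of rates). The sketch below checks that identity numerically. It is not the paper's model, and all function names are hypothetical, chosen for illustration only.

```python
import math

def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate)."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def multinomial_pmf(counts, probs):
    """P(counts) for a multinomial with sum(counts) trials."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)  # stays an exact integer at every step
    p = 1.0
    for c, q in zip(counts, probs):
        p *= q**c
    return coef * p

def joint_independent_poisson(counts, rates):
    """Joint pmf of independent Poisson counts."""
    return math.prod(poisson_pmf(c, r) for c, r in zip(counts, rates))

def joint_via_multinomial(counts, rates):
    """Same joint pmf, written as Poisson(total) times Multinomial(split)."""
    total_rate = sum(rates)
    probs = [r / total_rate for r in rates]
    return poisson_pmf(sum(counts), total_rate) * multinomial_pmf(counts, probs)

if __name__ == "__main__":
    counts, rates = [2, 0, 3], [1.5, 0.2, 2.3]  # illustrative values only
    print(joint_independent_poisson(counts, rates))
    print(joint_via_multinomial(counts, rates))
```

    The two printed values should agree up to floating-point rounding. Cells with a zero count contribute nothing to the multinomial split, which is consistent with the abstract's claim that the cost depends only on the number of nonzeros.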

    Identifying and Alleviating Concept Drift in Streaming Tensor Decomposition

    Tensor decompositions are used in various data mining applications, from social networks to medical applications, and are extremely useful in discovering latent structures or concepts in the data. Many real-world applications are dynamic in nature and so are their data. To deal with this dynamic nature of data, there exist a variety of online tensor decomposition algorithms. A central assumption in all those algorithms is that the number of latent concepts remains fixed throughout the entire stream. However, this need not be the case. Every incoming batch in the stream may have a different number of latent concepts, and the difference in latent concepts from one tensor batch to another can provide insights into how our findings in a particular application behave and deviate over time. In this paper, we define "concept" and "concept drift" in the context of streaming tensor decomposition, as the manifestation of the variability of latent concepts throughout the stream. Furthermore, we introduce SeekAndDestroy, an algorithm that detects concept drift in streaming tensor decomposition and is able to produce results robust to that drift. To the best of our knowledge, this is the first work that investigates concept drift in streaming tensor decomposition. We extensively evaluate SeekAndDestroy on synthetic datasets, which exhibit a wide variety of realistic drift. Our experiments demonstrate the effectiveness of SeekAndDestroy, both in the detection of concept drift and in the alleviation of its effects, producing results with similar quality to decomposing the entire tensor in one shot. Additionally, in real datasets, SeekAndDestroy outperforms other streaming baselines, while discovering novel useful components. Comment: 16 pages, accepted at ECML-PKDD 201

    Tensor Product Approximation (DMRG) and Coupled Cluster method in Quantum Chemistry

    We present the Coupled Cluster (CC) method and the Density Matrix Renormalization Group (DMRG) method in a unified way, from the perspective of recent developments in tensor product approximation. We present an introduction to recently developed hierarchical tensor representations, in particular tensor trains, which are matrix product states in physics language. The discrete equations of the full CI approximation applied to the electronic Schrödinger equation are cast into a tensorial framework in the form of the second quantization. A further approximation is performed afterwards by tensor approximation within a hierarchical format or, equivalently, a tree tensor network. We establish the (differential) geometry of low-rank hierarchical tensors and apply the Dirac-Frenkel principle to reduce the original high-dimensional problem to low dimensions. The DMRG algorithm is established as an optimization method in this format with alternating directional search. We briefly introduce the CC method and refer to our theoretical results. We compare this approach in the present discrete formulation with the CC method and its underlying exponential parametrization. Comment: 15 pages, 3 figures

    Error Analysis of TT-Format Tensor Algorithms

    The tensor train (TT) decomposition is a representation technique for arbitrary tensors, which allows efficient storage and computations. For a d-dimensional tensor with d ≥ 2, that decomposition consists of two ordinary matrices and d − 2 third-order tensors. In this paper we prove that the TT decomposition of an arbitrary tensor can be computed (or approximated, for data compression purposes) by means of a backward stable algorithm based on computations with Householder matrices. Moreover, multilinear forms with tensors represented in TT format can be computed efficiently with a small backward error.
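    To make the structure described above concrete (two matrices at the ends plus d − 2 third-order cores), here is a minimal sketch of a TT decomposition computed by successive truncated SVDs. This is the widely used SVD-based construction, not the Householder-based algorithm analyzed in the paper. The names `tt_svd` and `tt_to_full` and the `tol` parameter are hypothetical, introduced only for this illustration.

```python
import numpy as np

def tt_svd(tensor, tol=1e-12):
    """Return TT cores of shape (r_prev, n_k, r_next) via successive SVDs.

    Assumption: singular values below tol * (largest singular value)
    are discarded at each step.
    """
    dims = tensor.shape
    d = len(dims)
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = max(1, int(np.sum(s > tol * s[0])))
        # Left singular vectors become the k-th core.
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remainder forward and unfold it for the next mode.
        mat = (s[:new_rank, None] * vt[:new_rank]).reshape(new_rank * dims[k + 1], -1)
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the TT cores back into a full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full[0, ..., 0]  # drop the two boundary ranks of size 1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 4, 5, 6))
    cores = tt_svd(x)
    print([c.shape for c in cores])
    print(np.allclose(tt_to_full(cores), x))
```

    The first and last cores have a boundary rank of 1, so they are effectively ordinary matrices, and the d − 2 cores in between are genuine third-order tensors, matching the count given in the abstract.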

    An Iterative Model Reduction Scheme for Quadratic-Bilinear Descriptor Systems with an Application to Navier-Stokes Equations

    We discuss model reduction for a particular class of quadratic-bilinear (QB) descriptor systems. The main goal of this article is to extend the recently studied interpolation-based optimal model reduction framework for QBODEs [Benner et al. '16] to a class of descriptor systems in an efficient and reliable way. Recently, it has been shown in the case of linear or bilinear systems that a direct extension of interpolation-based model reduction techniques to descriptor systems, without any modifications, may lead to poor reduced-order systems. Therefore, for the analysis, we aim at transforming the considered QB descriptor system into an equivalent QBODE system by means of projectors, so that standard model reduction techniques for QBODEs can be employed, including the aforementioned interpolation scheme. Subsequently, we discuss related computational issues, resulting in a modified algorithm that allows us to construct near-optimal reduced-order systems without explicitly computing the projectors used in the analysis. The efficiency of the proposed algorithm is illustrated by means of a numerical example, obtained via semi-discretization of the Navier-Stokes equations.

    Expert recommendation via tensor factorization with regularizing hierarchical topical relationships

    © Springer Nature Switzerland AG 2018. Knowledge acquisition and exchange are generally crucial yet costly for both businesses and individuals, especially when the knowledge concerns various areas. Question Answering Communities offer an opportunity for sharing knowledge at a low cost, where community users, many of whom are domain experts, can potentially provide high-quality solutions to a given problem. In this paper, we propose a framework for finding experts across multiple collaborative networks. We employ the recent techniques of tree-guided learning (via tensor decomposition) and matrix factorization to explore user expertise from past voted posts. Tensor decomposition makes it possible to leverage the latent expertise of users, and the posts and related tags help identify the related areas. The final result is an expertise score for every user in every knowledge area. We experiment on Stack Exchange Networks, a set of question answering websites on different topics with a huge group of users and posts. Experiments show that our proposed approach produces stable, high-quality results.

    Approximating turbulent and non-turbulent events with the Tensor Train decomposition method

    Low-rank multilevel approximation methods are often well suited to high-dimensional problems, and they allow very compact representations of large data sets. Specifically, hierarchical tensor product decomposition methods, e.g., the Tree-Tucker format and the Tensor Train format, emerge as a promising approach for data arising in cascade-of-scales problems such as turbulent fluid dynamics. Beyond multilinear mathematics, those tensor formats are also successfully applied in, e.g., physics or chemistry, where they are used for many-body problems and quantum states. Here, we focus on two particular objectives: we aim at capturing self-similar structures that might be hidden in the data, and we present the reconstruction capabilities of the Tensor Train decomposition method tested with 3D channel turbulence flow data.

    WTEN: An advanced coupled tensor factorization strategy for learning from imbalanced data

    © Springer International Publishing AG 2016. Learning efficiently from imbalanced and sparse data in multi-mode and high-dimensional tensor formats is a significant problem in data mining research. On the one hand, Coupled Tensor Factorization (CTF) has become one of the most popular methods for joint analysis of heterogeneous sparse data generated from different sources. On the other hand, techniques such as sampling and cost-sensitive learning have been applied to many supervised learning models to handle imbalanced data. This research studies the effectiveness of combining the advantages of both CTF and imbalanced data learning techniques for missing entry prediction, especially for entries with rare class labels. Importantly, we have also investigated the implications of joint analysis of the main tensor and extra information. One of our major goals is to design a robust weighting strategy for CTF that not only recovers missing entries effectively but also performs well when the entries are associated with imbalanced labels. Experiments on both real and synthetic datasets show that our approach outperforms existing CTF algorithms on imbalanced data.